Enterprise Database Systems
Big Data Concepts and Terminology
Big Data Concepts: Big Data Essentials
Big Data Concepts: Getting to Know Big Data
Final Exam: Big Data Infrastructures

Big Data Concepts: Big Data Essentials

Course Number:
it_dlbdcdj_02_enus
Lesson Objectives

  • discover the key concepts covered in this course
  • describe how to add structure to raw data and name big data tools that aid this process
  • describe the difference between data warehousing and big data and specify the impact that big data has had on data warehousing
  • compare and contrast parallel and distributed computing systems
  • describe the difference between horizontal and vertical scaling and specify why horizontal scaling is the best choice with respect to big data
  • describe the Hadoop system and name its main features, benefits, and use cases
  • describe the subcomponents of Hadoop, such as MapReduce and HDFS
  • specify the importance of migrating from Hadoop to modern data platforms and briefly describe the migration process
  • compare the functionality and use cases of Hadoop and cloud computing platforms
  • name and describe the features of Hadoop HDFS and identify common storage systems including Kudu, Elasticsearch, and CockroachDB
  • describe in-memory storage systems and their use cases and advantages using examples
  • summarize the key concepts covered in this course

Overview/Description
Big data analytics, collecting vast amounts of data and transforming it into insights, drives major business decisions everywhere. Managers, decision-makers, data technicians, and data enthusiasts alike benefit from knowing how various systems and technologies are used in big data projects. Use this course to progress from a foundational comprehension of big data analytics to grasping more advanced concepts, like parallel and distributed computing systems and horizontal and vertical scaling. Take an in-depth look at Hadoop's main components and characteristics and how it's used for big data analytics. Then, delve into the various kinds of storage systems used in big data. Upon completing this course, you'll have a greater comprehension of the tools and methods used to execute big data projects.

Target

Prerequisites: none

Big Data Concepts: Getting to Know Big Data

Course Number:
it_dlbdcdj_01_enus
Lesson Objectives

  • discover the key concepts covered in this course
  • describe the concept of big data and the history behind it
  • identify the sources that are capable of generating big data
  • define the seven characteristics of big data: volume, velocity, variety, variability, veracity, visualization, and value
  • compare structured and unstructured data and describe how the ability to extract value from unstructured data is important when dealing with big data
  • describe the process of deciphering correlations, market trends, patterns, and customer behavior using big data
  • describe the main advantages of big data analytics, including cost reduction and better decision-making
  • list top domains that are exploring and utilizing big data technologies, including process automation, security, and credit scoring
  • describe how Netflix uses big data to generate billions of dollars in revenue
  • describe how Amazon uses big data to understand customers
  • list and describe five main challenges when dealing with big data
  • summarize the key concepts covered in this course

Overview/Description
Big data analytics has become an essential part of any business dealing with the digital world. The ability to collect large amounts of data and turn it into insights has transformed the world's business landscape. To properly manage projects using such technologies, leaders should have at least a foundational understanding of big data. Use this course to get to grips with the concepts and terminology you'll need when discussing big data projects. Learn about the primary sources and characteristics of big data. Then, dive into the world of big data analytics, exploring its main advantages, use cases, and significant challenges. When you've finished this course, you'll be able to speak confidently about data-related projects and discuss relevant data infrastructures and architectures.

Target

Prerequisites: none

Final Exam: Big Data Infrastructures

Course Number:
it_fedldm_02_enus
Lesson Objectives

  • define the seven characteristics of Big Data
  • define the role of the data processing layer and specify how information captured in the previous layer is processed
  • describe graph database use cases and specify why the relationship between data is as important as the data itself in a graph database
  • describe Spark and how it offers open-source scalable massively parallel in-memory solutions for analytics applications
  • describe the challenges in the current data analytics models and system designs such as scalability, consistency, reliability, efficiency, and maintainability
  • describe the concept of Big Data and the history behind it
  • describe the difference between horizontal and vertical scaling
  • describe the role of NoSQL databases in the horizontal distribution of large volumes of structured and unstructured data
  • describe the subcomponents of Hadoop such as MapReduce and HDFS
  • describe what horizontal scaling is and specify how using clusters (also known as sharding) eliminates the need to add more memory to existing machines
  • identify the sources that are capable of generating Big Data
  • list the main characteristics of Spark, such as loading behavior, file formats, parallelism, cache, and data skew
  • name and describe the features of storage systems such as HDFS, S3 and object stores, Elasticsearch and Apache Solr, Kudu, and CockroachDB
  • name and describe the four types of Big Data analytics (prescriptive, predictive, diagnostic, and descriptive)
  • name and describe the role of the main layers of Big Data analytics from the bottom to the top
  • name the most important performance optimization techniques, such as file format selection, level of parallelism, and API selection
  • recognize the need for Big Data
  • specify the shortcomings of distributed systems and why these shortcomings make Big Data even more important
  • specify use cases, benefits and challenges of popular key-value data stores
  • specify when to use a NoSQL database and when to use a SQL database

Overview/Description

Final Exam: Big Data Infrastructures will test your knowledge and application of the topics presented throughout the Big Data Infrastructures track of the Skillsoft Aspire Data for Leaders and Decision Makers Journey.



Target

Prerequisites: none
